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ABSTRACT: The majority of nonlinear niters used in practise are stack filters. An algorithm is 
presented which calculates the output distribution of an arbitrary stack filter S from the disjunctive 
normal form (DNF) of its underlying positive Boolean function. The so called selection probabilities 
can be computed along the way. 



1 Introduction 



Stack filters were invented in 1986 and have been a key topic of research in nonlinear signal 
processing ever since. Simply put, all aspects of a stack filter are reflected in its underlying 
positive Boolean function, and a basic familiarity of the latter concept is all that is required 
to understand this article. Using Google Scholar one can easily track the literature on various 
other aspects of stack filters, e.g. their output distribution. In this article we present a new 
algorithm to calculate the output distribution. The new method, called stack filter n-algorithm, 
is an extension of the noncover n-algorithm [10] which generates, in compact form, all noncovers 
X of given sets A\, ■ ■ ■ , A* h (i.e. X 2 A* for all 1 < i < h). 

The stack filter n-algorithm is introduced by means of a medium-size example in section 2. 
Section 3 is dedicated to its theoretic assessement, to numerical experiments and to the discussion 
of an approach [8] based on binary decision diagrams. Finally in section 4 three enhancements 
are discussed. In particular the popular rank selection probabilities are treated, and stack filters 
are generalized to balanced stack filters in the sense of [9]. The present article can be viewed as 
the realization of a fifth benefit of DNF's that was announced in [11]. 



2 The stack filter n-algorithm 



Fix m > 1 and put w := 2m + 1. Let b : {0, 1} W — > {0, 1} be a positive Boolean function (PBF), 
i.e. one without negated variables. Refering to e.g. [2], an operator S from R z in itself defined 
by its k-th component being 

[Sz] k := b(Zk-m, ■ ■ ■ ,Zk, ■ ■ ■ ,Z k +m) (k & Z) (1) 
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is called a stack filter of window size w based on b. Notice that the PBF b in (1) has been 
extended from {0, 1} W — > {0, 1} to W" — > R in the usual way, i.e. by replacing the logical 
connectives A and V by the minimum respectively maximum operation for pairs of real numbers 
(while keeping the symbols). So, if 

b(x-i,x ,xi) : = ((x V xi) A x_i) V x , (x, G {0, 1}) 

then 

6(3, 2, 4) = ((2 V 4) A 3) V 2) = (4 A 3) V 2 = 3 V 2 = 3. 

By construction each stack filter S is translation invariant in the sense that pushing the series x 
ten units to the right and then applying S yields the same as first applying S and then pushing 
ten units to the right. So S is completely determined by formula (1) for k = 0. 

Let Z = (• • • , Z-i, Zq, Z%, ■ ■ •) be a doubly infinite sequence of independent indentically dis- 
tributed (i.i.d.) random variables. Let Fz(t) be their common (cumulative) distribution func- 
tion, i.e. Fz(t) := Prob(Zi < t) is the probability that Zi is at most t. By translation invariance 
the output distribution Fgzif) '■= Prob((SZ)i < t) is independent of i. It is known that there is 
a well defined function 4>s(p), called the distribution transfer of S, such that 

Fsz(t) = 4>s{Fz{t)) (t G R) (2) 
What's more, (fts(p) is a polynomial which can be calculated [12], [2, p. 223 ] as 

5 (p) = P lZeTO(x)l ■ gl° ne MI (3) 

b(x)=0 

where q := 1 — p and b is as in (1). The summation is over all bitstrings x G {0, 1} W with 
b(x) = 0, where by definition 

Zero(x) := {1 < i < w\ xi = 0}, 

One(x) := {1 < i < w\ Xj = 1}. 

The range of the index set can be any convenient finite interval of Z, it need not be {1, 2, • • • , w\ 
as above. For instance, consider this positive Boolean function which is already in disjunctive! 
normal form (DNF): 

&i(x_4, • • • , X4) = (x_2 A x_i A xo) V (x_i A xo A xi) V (xo A xi A X2) (4) 

V (x_4 A x_3 A x_2 A xi A X2 A X3) V (x_3 A x_2 A x_i A xi A X2 A X3) 

V (x_3 A x_2 A x_i A X2 A X3 A X4) 

Put W = {—4, —3, • • • , 4} and w = \W\ = 9. In view of (3) we wish to encode the family Mod 
of all x = (x_4, x_3, • • • , X4) in {0, 1} W with 6(x) = in a compact way. First note that 

Mod = Modi HMod 2 nMod 3 nMod 4 nMod 5 n Mod 6 , (5) 

"The conjunctive normal form (CNF) would serve just as well since everything to come dualizes in obvious 
ways. 
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where the family Modj corresponds to the i-th conjunction in (4). For instance we write 

Modi := {xG{0,1} h, |x_ 2 Ai-iAx = 0} = (2, 2, n, n, n, 2, 2, 2, 2) 

because x_2 A X-i A xq = (nul) if and only if at least one of X-2, x-±, xq is nul, and the 
other variables x_4, x_3, x\, X2, £3, X4 can independently assume the 2 values and 1. Thus 
(1,1,0,1,0,1,0,1,1) € Modi but (0,0,1,1,1,0,0,1,0) Modi. If we identify a 0, 1-string x 
with the subset X = {i £ W : X; L = 1} of W then Modi consists of all noncovers X of 
A* := {—2, —1,0} in the sense that X 2 -A*- The noncover n-algorithm from [10] (more details 
below) generates all simultaneous noncovers of given sets (here deriving from conjunctions of a 
PBF) A\ , A* 2 , ■ ■ ■ , A* h as follows: 
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Table 1 



By PC = 2 we mean that at this stage the pending conjunction is the second one, i.e. the one 
that defines Mod2- In other words, we need to sieve out those x € Modi that happen to be in 
Mod2 = (2, 2, 2, n, n, n, 2, 2, 2). In order to do so we determine the intersection {—2,-1,0} PI 
{-1,0,1} = {-1,0} of the "n-pools" of Modi and Mod 2 and then split the {0, 1, 2, n}-valued 
row r := Modi accordingly into a disjoint union r = r' U r" where 

r' ■= {x € r\ x_i = or x = 0} = (2, 2, 2, n, n, 2, 2, 2, 2) 
r" := {x€r|x_i = x = l} = (2,2,0,1,1,2,2,2,2) 
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While all x € r' trivially satisfy x_i A xq A X\ = 0, i.e. belong to Mod2, this is not the case 
for all x 6 r". However, turning at the 6-th position the 2 to does the job. This yields the 
current working stack with the two rows labelled PC = 3; see top of Table 1. (Of course this 
"stack" has nothing to do with its namesake in "stack filter") As a general rule, the topmost 
row in the stack is always treated first ("last in, first out"). This may entail "local changes", or 
a splitting of the top row into several sons. In this way we proceed up to the second last stack in 
Table 1. Let us pick its top row r = (2, n, n, 2, 0, n, n, n, 2) and illustrate once more the splitting 
process. The intersection of the n-pool of r with (the index set of) the pending 6th conjunction 
is {-3,-2,1,2,3} n {-3, -2, -1,2,3,4} = {-3,-2,2,3}. Accordingly split r into the disjoint 
union of r' and r": 

r = (2, n, n, 2, 0, n, n, n, 2) 
r' = (2,n,n,2,0,2,n,n,2) 
r" = (2,1,1,2,0,0,1,1,2) 

Since r' C Mod 6 , r' is the first son of r. We have r" % Mod 6 , but r"nMod 6 = (2, 1, 1, n, 0, 0, 1, 1, n) 
becomes the second son. Both rows are final, i.e. are subsets of Mod and thus collected in a 
steadily increasing final stack. The working stack now contains three rows with pending con- 
junctions 6, 4, 3 respectively. In our case it just so happens that they are in fact already final (so 
e.g. all x in the row labelled PC = 4 happen to satisfy the 4th, 5th and 6th conjunction). The 
final stack comprises thus the five rows in Table 2 (for the moment ignore p 2 q 2 and so forth): 
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Table 2 



For instance, the second row in Table 2 contains 2 5 • (2 2 — 1) noncovers, where (2 2 — 1) comes 
from nn. The total number N of noncovers evaluates to 

N = 32 + 32 • 3 + 2 + 2 • 3 + 16 • 15 = 376 (6) 

which is much higher than the number R = 5 of final multivalued rows. 

Let us now calculate the output distribution. The first row in Table 2 contains 2 5 = 32 bitstrings 
x with b\(x) = 0. Each contributes some probability ct\ ct2 p q q p «3 04 05 to the sum in (3). 
Since each a« can independently be chosen to be p or q, the sum of these 32 terms is 

p 2 q 2 {ppppp + • • • + pqqpq + • • • + qqqqq) = p 2 q 2 {p + q) b = p 2 q 2 (7) 

The fact that e.g. nn = {00,01, 10} yields pp + pq + qp = 1 — q 2 , explains the contribution 
pq(l — q 2 ) of the second row. Similarly for the three other rows. Summing up the terms in Table 
2 yields 

4>s(p) = p 2 q 2 +pq-pq 3 +p 3 q 5 +p 2 q 4 -p 2 q 6 +p-pq 4 (8) 
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3 Theoretic and numeric assessment 



Suppose the constraint A* = {3,4} is to be imposed on the row r = (1,2, 1, 1) in the process 
of the noncover n-algorithm. Then r needs to be cancelled since no member X € r satisfies 
I 2 i*. Fortunately, with some precautions (not mentioned in section 2) the cancellation of 
rows can be avoided, which is essential in the Theorem below. 



Theorem: Suppose the stack filter S has window size w and its positive Boolean function 
b(x) is given as a disjunction of h conjunctions (DNF). Then the stack filter n-algorithm 
computes the output distribution of S in time 0(Nw 2 h 2 ). Here N is the number of 
bitstrings x with b(x) = 0. 



Proof. Let be a Horn-formula with w variables which is a conjunction of h Horn-clauses. As 
usual a Horn-clause is defined as a disjunction of literals at most one of which can be positive. 
In [10, Thm.2] it is shown how the Horn n-algorithm generates, along the lines of the previous 
example, all models of <j> as a disjoint union of {0, 1, 2, n}-valued rows in time 0(hw + Nh?w 2 ). 
If it is known beforehand that N > 0, then of course 0(hw + Nh 2 w 2 ) = 0(Nh 2 w 2 ). That's 
what happens here since our Horn-clauses (corresponding to the sets A\, ■ ■ ■ ,A^ from before) 
are fully negative, and so (0, 0, • • • , 0) (the empty set) is a model. It is clear that for each 
{0, 1, 2, n}-valued row its probability contribution (akin to (7)) can be calculated in time 0(w 2 ). 
This yields the claimed complexity 0(Nh 2 w 2 ) + 0(Nw 2 ) = 0(Nhw 2 ). ■ 

As detailed in [10], the Horn n-algorithm becomes the noncover n-algorithm (which morphs 
to the stack filter n-algorithm in the present article) when all clauses are fully negative, and it 
becomes the implication n-algorithm when no clause is fully negative. In all cases the occuring N 
in the complexity bounds could be substituted by the number R < N of final {0, 1, 2, n}-valued 
rows, which however is hard to predict. It can happen that R = N (even then e.g. 0(Nh 2 w 2 ) 
beats the 0(2 w hw 2 ) cost of searching the whole of {0, 1} W ), but as glimpsed in our example R 
is often orders of magnitudes smaller than N. 

It has been pointed out that b(x) may not initially be given in disjunctive normal form. However, 
if not, there are efficient methods to compute the DNF from any reasonable kind of presentation 
of b(x); this e.g. applies to the erosion - dilation cascades below. In any case, the bigger problem 
arguably is to find the bitstrings x G {0, 1} W with b(x) = 0. 

3.1 On binary decision diagrams 

Shmulevich et al. [8] proposed to evaluate (3) by setting up a binary decision diagram (BDD) 
for the Boolean function b(x) that underlies the stack filter S whose distribution transfer needs 
to be calculated. Suppose one has indeed spent time to get a BDD that represents b(x). While 
the number of models x € {0, 1} W with b(x) = can be determined fast from a BDD, it is 
more cumbersome to generate all models, as is forced by (3). True, from the BDD one can get 
the set of models as a disjoint union of {0, 1, 2}-valued rows in recursive fashion [l,p.22], but 
these rows are far more numerous than the ones produced by the stack filter n-algorithm; not 
surprisingly since our algorithm uses one additional symbol and hence more flexibililty in its 
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{0, 1, 2, ra}-valued rows. Finally, the enhancements discussed in section 4 are cumbersome to be 
handled by BDD's. 

3.2 Numercis exemplified on L£/L[/-smoothers 

Certain stack filters L n , their duals U n , and compositions thereof (called LULU filters) have 
been proposed in [7] and earlier, as alternatives to the popular median filters. For instance, as 
opposed to the latter, all LULU filters S are idempotent in the usual sense that S o S = S. 
Actually, the function &i(x_4, • • • ,£4) from section 2 is the PBF underlying U2L2. 

The natural definition of each LULU filter is as a cascade of so called erosions and dilations 
(CED), two dual concepts from Mathematical Morphology [7, III.C]. Computing the DNF of 
any CED essentially amounts to calculating CNF's and DNF's of successively bigger positive 
Boolean functions. For instance, C4 = LaJJ/^L^U^L2U2L\U\ is a CED stack filter with window 
size w = 46. Computing its DNF which comprises (exactly) 22'000 clauses took about four 
hourd3- Applying the stack filter n- algorithm to this DNF took only 63 seconds. The result is a 
degree 36 polynomial with coefficients as high as 2544p 28 . 

Due to the specific regularities of U n L n its DNF has in fact been discovered by other means [7, 
p. 112] and its distribution transfer was computed independent of its DNF in [3]; it equals 

(j) UnLn = l-q^ 1 -npq n+1 -pq 2n+2 - |(n - l)(rc + 2)pV n+2 (10) 

One verifies that (10) coincides with (8) for n = 2. Even the distribution transfer of C n := 
L n U n L n ~\U n -\ ■ ■ ■ L\U\ can be determined [3], albeit only by an efficient recursive formula as 
opposed to the closed form in (10). For all n < 5 the results agreed with the ones obtained with 
the stack filter n- algorithm, which is a strong indication that both methods are correct. 

Our algorithm (including the selection probabilities discussed in 4.1) will hopefully soon be 
available to any user of Mathematica in the form of a so called demonstration project. 



4 Three enhancements 

The stack smoother n-algorithm invites three upgrades that concern the so called rank selection 
probabilities (4.1), the calcuation of the joint distribution of two stack filters (4.2), and the 
elevation of to stack filters to balanced stack filters (4.3) . 

' Computing the DNF of a PBF from its CNF is a well researched topic [4], which also amounts to get all 
minimal transversals of a set system. The author used a refinement of the classic "Berge-algorithm" for the task. 
We do not claim that it competes with the cutting edge algorithms for DNF o CNF, but we feel that the stack 
filter n-algorithm is the right approach once the DNF is given. 
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4.1 Rank selection probabilities 



Let S be a stack filter. Given a sequence Z of independent identically distributed random 
variables, the so called rank selection probability pi is denned as the probability that a fixed 
component of the output series SZ is the i-ih smallest in the sliding window of length w. It is 
known, [6], [2, p. 236] that 

A w ~ i A w — 
Pi ~ 7 w \ ~~ 7 w y 
\w—i) \w—i+l) 

where A{ is the number of bitstrings x with i ones and w — i zeros that have b(x) = 0. The A^s 
can be conveniently calculated in tandem with the evaluation of (3). For instance, as the reader 
can easily verify, the contribution of the last row in Table 2 to Aq up to Aj is: 
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= 22 


A 7 : 


(S(S 


= 4 



We mention that in [5] the optimization of stack filters with respect to certain constraints leads 
to specific desirable values of Ai, ■ ■ ■ , A w . Finding a stack filter S that features these values (at 
least approximately) is however hard. The author plans to compile a catalogue of CED's (see 
3.2) with corresponding vectors (A\, • • • , A w ) from which a suitable candidate S can be picked. 

4.2 The joint output distribution of two stack niters 

Let Z be a doubly infinite sequence of i.i.d. random variables. For two stack filters S and T 
with corresponding positive Boolean functions b\{x) and 62 (y) their joint output distribution 
Fsz,Tz(s,t), or simply JD(s,t), is defined as 

JD(s, t) := Prob((S"Z) < s and (TZ) < t) 

If we set p := Prob(Zo < s), it := Prob(Zo < t) and assume p < ir (the case p > it is similar) 
then it is shown in [2, p. 230] that 

w w 

JD(s,t) = Y.zZ A ^P i ^-P) W - i - j ^-^) j ^ (11) 

i=0 j=0 



7 



where Aij is the number of (x,y) G {0, 1} W x {0, 1} W such that 

x > y, bi(x) = b 2 (y) = 0, v__(a:,y)=i, u +)+ (x,y)=j, 

and wherd^ 

v--(xi,--- ,x w ,yt,--- ,y w ) := \{1 < k < w : x k = y k = 0}| 

v +t+ (xi, ■ ■ ■ ,x w ,yi, ■ ■ ■ ,y w ) := \{1 < k < w : x k = y k = 1}| 

The calculation of the coefficients A^ works row-wise. So suppose r below is one of the final 
rows obtained after applying the noncover n-algorithm to b\. Obviously the set 

T := {y : (3x G r) x > y} 

is represented by row r$- If say bi (y) = ys A yg A y\o then the set 

F{b 2 ) := {y £ T : b 2 (y) = 0} 

is the disjoint union p\ U p 2 U p^: 
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Table 3 

For each x € r and A; £ {1,2,3} one now records v (x, y) and u+ _)-(a;, y) for all y £ p k with 

y < x. For instance, taking the x indicated in Table 3 one verifies that 

where the later union is disjoint (see n 2 n 2 in p% and the corresponding boldface entries in a, r). 
It is easy to see that a contributes an amount of Q^j to the value of A4 j for all < j < 5. 

Similarly r contributes an amount of ( .) to the value of A4 j + i (0 < j < 4). Calculations can be 
sped up by clumping together suitable x's rather than processing them one by one. We discuss 
a similar phenomenon in more detail in the next section. 

4.3 Balanced stack niters 

In [9] the concept of a kt/ance(|§ stack filter S is introduced. Suffice it to say that S is based on 
"mirrored thresholding" (which entails t and —t to play symmetric roles). Most important for 

''Since the letter w is occuped we use and «+,+ rather than u>_,_ and as in [2]. 

§ Actually, Arce, Paredes and Shmulevich propose to reserve the term "stack filter" to their new concept, and to 
relabel the "old" stack filters as stack smoothers. As suggested by one referee, we stick to the old, well established 
terminology. 
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us, S is based again upon a PBF albeit in a manner more sophisticated than (1). For instance, 
the PBF is of the kind b(x, y) = b(x\, • • • , x w ,y\, • • • , y w ), and in this set up a stack filter turns 
out to be a balanced stack filter where b does not depend on yi, ■ ■ ■ , y w (i.e., these variables are 
fictitious). As usual let Z be a doubly inifinite sequence of i.i.d. random variables with common 
cumulative distribution function Fz(t) = Prob{Zi < t) (j £ Z). Put F(t) = Fz(t) and 



p- 



p- 



P4 



F(-t) - F(t) if t < 
if t > 

if t < 
F{t) - F{-t) if t > 

F(t) if t < 
F(-t) if t>0 

1 - if t < 
1 - F(t) if t > 



Besides and V--(x,y) from 4.2 we also put 

v_ t+ (x,y) := \{1 < k < w : x k = and y k = 1}| 



u + _(a?,y) := |{1 < k < w : x k = 1 and y k = 0}| 

Modulo some obvious typos, it is shown in [9, (17)] that the output distribution, i.e. Fsz(t) = 
Prob((SZ)o < t), can be calculated as 

F S z{t) = 2^ p+x v ■ p^l ■ p_x v ■ v~;~ 

b(x,y)=0 

As opposed to JD(s,t) in (11), which is a polynomial of Prob(Zo < s) and Prob(Zo < t), here 
Fsz(t) is not quite a polynomial in terms of Prob((S Z)q < t) and Prob((SZ)o < —t). 



Nevertheless the noncover n-algorithm is of good use. Suppose it has (among others) returned 
the final row r below. Take any bitstring x* = (x%, . . . ,xg) "contained" in the left hand side 
(rii, n,2, n$, 1, 77.4, rt4, 0, 2, 77.3) of r. More precisely, any bitstring x* which is extendibl^ to a 
bitstring (X*,y) 6 r. Say x* = (1,1,1,1,1,0,0,0,0). For each fixed k G {0, 1, • • • 5} and 
k' £ {0, 1, • • • , 4} we now show how the number f(k, k') of bitstrings y = (y%, ■ ■ ■ , yg) with 

7j +i+ (x*,2/) = k and v_ + (x*,y) = k' 

(whence v+-(x*, y) = 5 — k and ^-(x*, y) = 4 — A; ) 
can be calculated fast. First, notice that the subset 

r(x*) := {(x,y)€r: x = x*} 

of r can be written as multi-valued row as shown in Table 4. 

^It is easily seen that the extendible bitstrings are exactly the members of (2, 2, 2, 1, ru, 714, 0, 2, 2). 
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Problem is we cannot freely choose k l's among {yi, - ■ ■ ,ys} and k' l's among {y6)"")?/9} 
because e.g. the choice (1,1,0,0,0,1,1,0,0) clashes with n\n\n\ri\. But when one partitions 
r(x*) as ri U r2 U r3 U as indicated, then for each the choices within {y±, ■ ■ ■ ,7/5} respec- 
tively {ye, • • • ,yg} can be made independently. To fix ideas, say k = 2 and k' = 3. Then the 
contribution of r(x*) = n U T2 U r3 U r4 to the coefficient of the monom 

occuring in F$x {t) is 

f{k,k!) = 8-3 + 1-2 + 1-2 + 0-0 = 28. 



Generally, the number of bitstrings with a fixed number k of l's that are contained in a 
{0, 1, 2, n}-valued row can be determined fast. Similar to 4.2, but more obvious, time can be 
saved by clumping together suitable bitstrings (xi, • • • , xg). For instance, (1, 1, 0, 1, 0, 0, 0, 1, 1) 
causes the same right hand side (ni, n\, 2, n,2, ri2, ni, n\, ri2, 112) as did x*. As another example, 
(0, 0, 1, 1, 1, 0, 0, 0, 0) is one among ten left hand sides of weight 3 that cause the right hand side 
(2,2,2,2,2,2,2,2,2). 
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